Evolving Guide Trees in Progressive Multiple Sequence Alignment
Abstract
We present a novel application of genetic algorithms to the problem of aligning multiple biological sequences through the optimization of guide trees. Individual guide trees are represented as coalescing binary trees which provide for efficient and meaningful crossover and mutation operations. We hypothesize that our technique avoids the limitations of other heuristic tree-building techniques, and is therefore capable of producing better trees, as measured by higher quality alignments. Further, our approach is more scalable than commonly used progressive alignment techniques when aligning large datasets.
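The abstract does not spell out the encoding, so the following is only a minimal sketch of one way a coalescing binary tree could be represented so that crossover and mutation always yield valid guide trees. It assumes a genome of n-1 merge steps, where step k names two of the n-k clusters still present; because the validity of each step depends only on its position, one-point crossover and per-gene resampling need no repair. All function names (`random_tree`, `decode`, `crossover`, `mutate`) are hypothetical, and the fitness function (the score of the alignment built from the decoded tree) is deliberately left out.

```python
import random

def random_gene(n, k, rng):
    """Pick two distinct cluster indices among the n-k clusters left at step k."""
    i, j = sorted(rng.sample(range(n - k), 2))
    return (i, j)

def random_tree(n, rng):
    """A genome is a list of n-1 merge steps (assumed encoding, not the paper's)."""
    return [random_gene(n, k, rng) for k in range(n - 1)]

def decode(genome, labels):
    """Replay the merges to obtain a nested-tuple guide tree over the labels."""
    clusters = list(labels)
    for i, j in genome:
        right = clusters.pop(j)          # j > i, so index i is unaffected
        clusters[i] = (clusters[i], right)
    return clusters[0]

def crossover(a, b, rng):
    """One-point crossover; requires at least two merge steps (n >= 3)."""
    cut = rng.randrange(1, len(a))
    return a[:cut] + b[cut:]

def mutate(genome, n, rng, rate=0.1):
    """Resample each merge step independently with probability `rate`."""
    return [random_gene(n, k, rng) if rng.random() < rate else gene
            for k, gene in enumerate(genome)]

if __name__ == "__main__":
    rng = random.Random(42)
    labels = ["seqA", "seqB", "seqC", "seqD", "seqE"]
    p1, p2 = random_tree(len(labels), rng), random_tree(len(labels), rng)
    child = mutate(crossover(p1, p2, rng), len(labels), rng)
    print(decode(child, labels))
```

In a full genetic algorithm, each decoded tree would be handed to a progressive aligner and scored by the quality of the resulting alignment; that scoring step is where most of the running time would be spent.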
Similar resources
The Effect of the Guide Tree on Multiple Sequence Alignments and Subsequent Phylogenetic Analysis
Many multiple sequence alignment methods (MSAs) use guide trees in conjunction with a progressive alignment technique to generate a multiple sequence alignment but use differing techniques to produce the guide tree and to perform the progressive alignment. In this paper we explore the consequences of changing the guide tree used for the alignment routine. We evaluate four leading MSA methods (P...
The effect of the guide tree on multiple sequence alignments and subsequent phylogenetic analyses.
Many multiple sequence alignment methods (MSAs) use guide trees in conjunction with a progressive alignment technique to generate a multiple sequence alignment but use differing techniques to produce the guide tree and to perform the progressive alignment. In this paper we explore the consequences of changing the guide tree used for the alignment routine. We evaluate four leading MSA methods (P...
Application of the MAFFT sequence alignment program to large data—reexamination of the usefulness of chained guide trees
MOTIVATION: Large multiple sequence alignments (MSAs), consisting of thousands of sequences, are becoming more and more common, due to advances in sequencing technologies. The MAFFT MSA program has several options for building large MSAs, but their performances have not been sufficiently assessed yet, because realistic benchmarking of large MSAs has been difficult. Recently, such assessments hav...
Simple chained guide trees give high-quality protein multiple sequence alignments.
Guide trees are used to decide the order of sequence alignment in the progressive multiple sequence alignment heuristic. These guide trees are often the limiting factor in making large alignments, and considerable effort has been expended over the years in making these quickly or accurately. In this article we show that, at least for protein families with large numbers of sequences that can be ...
A comparative analysis of progressive multiple sequence alignment approaches using UPGMA and neighbor joining based guide trees
Multiple sequence alignment is increasingly important to bioinformatics, with several applications ranging from phylogenetic analyses to domain identification. There are several ways to perform multiple sequence alignment, an important way of which is the progressive alignment approach studied in this work. Progressive alignment involves three steps: find the distance between each pair of seque...